---
title: "Example: Contextual Recall | Evals | Kastrax Docs"
description: Example of using the Contextual Recall metric to evaluate how well responses incorporate context information.
---

import { GithubLink } from "@/components/github-link";

# Contextual Recall ✅

This example demonstrates how to use Kastrax's Contextual Recall metric to evaluate how effectively responses incorporate information from provided context.

## Overview ✅

The example shows how to:

1. Configure the Contextual Recall metric
2. Evaluate context incorporation
3. Analyze recall scores
4. Handle different recall levels

## Setup ✅

### Environment Setup

Make sure to set up your environment variables:

```bash filename=".env"
OPENAI_API_KEY=your_api_key_here
```

### Dependencies

Import the necessary dependencies:

```typescript copy showLineNumbers filename="src/index.ts"
import { openai } from '@ai-sdk/openai';
import { ContextualRecallMetric } from '@kastrax/evals/llm';
```

## Example Usage ✅

### High Recall Example

Evaluate a response that includes all context information:

```typescript copy showLineNumbers{5} filename="src/index.ts"
const context1 = [
  'Product features include cloud sync.',
  'Offline mode is available.',
  'Supports multiple devices.',
];

const metric1 = new ContextualRecallMetric(openai('gpt-4o-mini'), {
  context: context1,
});

const query1 = 'What are the key features of the product?';
const response1 = 'The product features cloud synchronization, offline mode support, and the ability to work across multiple devices.';

console.log('Example 1 - High Recall:');
console.log('Context:', context1);
console.log('Query:', query1);
console.log('Response:', response1);

const result1 = await metric1.measure(query1, response1);
console.log('Metric Result:', {
  score: result1.score,
  reason: result1.info.reason,
});
// Example Output:
// Metric Result: { score: 1, reason: 'All elements of the output are supported by the context.' }
```

### Mixed Recall Example

Evaluate a response that includes some context information:

```typescript copy showLineNumbers{27} filename="src/index.ts"
const context2 = [
  'Python is a high-level programming language.',
  'Python emphasizes code readability.',
  'Python supports multiple programming paradigms.',
  'Python is widely used in data science.',
];

const metric2 = new ContextualRecallMetric(openai('gpt-4o-mini'), {
  context: context2,
});

const query2 = 'What are Python\'s key characteristics?';
const response2 = 'Python is a high-level programming language. It is also a type of snake.';

console.log('Example 2 - Mixed Recall:');
console.log('Context:', context2);
console.log('Query:', query2);
console.log('Response:', response2);

const result2 = await metric2.measure(query2, response2);
console.log('Metric Result:', {
  score: result2.score,
  reason: result2.info.reason,
});
// Example Output:
// Metric Result: { score: 0.5, reason: 'Only half of the output is supported by the context.' }
```

### Low Recall Example

Evaluate a response that misses most context information:

```typescript copy showLineNumbers{53} filename="src/index.ts"
const context3 = [
  'The solar system has eight planets.',
  'Mercury is closest to the Sun.',
  'Venus is the hottest planet.',
  'Mars is called the Red Planet.',
];

const metric3 = new ContextualRecallMetric(openai('gpt-4o-mini'), {
  context: context3,
});

const query3 = 'Tell me about the solar system.';
const response3 = 'Jupiter is the largest planet in the solar system.';

console.log('Example 3 - Low Recall:');
console.log('Context:', context3);
console.log('Query:', query3);
console.log('Response:', response3);

const result3 = await metric3.measure(query3, response3);
console.log('Metric Result:', {
  score: result3.score,
  reason: result3.info.reason,
});
// Example Output:
// Metric Result: { score: 0, reason: 'None of the output is supported by the context.' }
```

## Understanding the Results ✅

The metric provides:

1. A recall score between 0 and 1:
   - 1.0: Perfect recall - all context information used
   - 0.7-0.9: High recall - most context information used
   - 0.4-0.6: Mixed recall - some context information used
   - 0.1-0.3: Low recall - little context information used
   - 0.0: No recall - no context information used

2. Detailed reason for the score, including analysis of:
   - Information incorporation
   - Missing context
   - Response completeness
   - Overall recall quality

<br />
<br />
<hr className="dark:border-[#404040] border-gray-300" />
<br />
<br />
<GithubLink
  link={
    "https://github.com/kastrax-ai/kastrax/blob/main/examples/basics/evals/contextual-recall"
  }
/>
